Abstract

The recently proposed close-packed motif for collagen is investigated using first principles 
semi-empirical wave function theory and Kohn-Sham density functional theory. Under these 
refinements the close-packed motif is shown to be stable. For the case of the 7/2 motif a similar 
stability exists. The electronic circular dichroism of the close-packed model has a significant 
negative bias and a large signal. An interesting feature of the close-packed structure is the 
existence of a central channel. Simulations show that, if hydrogen atoms are placed in the cavity,
a chain of hydrogen molecules is formed, suggesting a possible biological function for molecular
hydrogen.

In this Letter, we consider the close-packed (CP) structure for collagen [1], subject this
motif to first-principles calculations, and compare the results with calculations for the 7/2
motif. The motivation behind the CP structural motif is the remarkable coincidence that it is
optimally packed while at the same time having a vanishing coupling between strain and
twist [2]. While the 7/2 and 10/3 structural motifs are supercoiled triple-helix structures, the
CP structure is akin to that of a rope in which the three polypeptide strands are intertwined.
The CP geometry is the non-periodic triple helix that optimizes the volume fraction; it
has a helical pitch of 20 Å.

Hitherto, the 10/3 structure suggested by Ramachandran and by Rich and Crick [3-6] and
the 7/2 structure suggested by Okuyama et al. [7, 8] have been the most studied motifs
for collagen [9] and collagen-like peptides [10, 11]. These two motifs belong to the same
symmetry class: three left-handed helical polypeptide chains are supercoiled in a
right-handed arrangement. For long helical structures, the relatively short range of the strand
and inter-strand interactions does not favor commensuration, i.e. an integer number of
residues on the polypeptide strands corresponding to a 2π supercoiling. Hence, in general, the
structure will not be periodic. For the periodic 10/3 structure it takes ten residues to make
three full 2π rotations and complete the ~85 Å unit cell, while for the periodic 7/2 structure
it takes seven residues of one polypeptide chain to complete two full 2π rotations, which
complete a unit cell of about 60 Å in length. A contributing factor to the original suggestion
to change focus from the 10/3 structure to the 7/2 structure was the presence of diffraction
spots in x-ray patterns corresponding to a longitudinal period of 20 Å, which were unaccounted
for by the 10/3 structure [12].
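
The commensuration argument above can be illustrated with a few lines of arithmetic. The sketch below is an illustration only, not part of the original calculations: it converts the residue and turn counts quoted for the 10/3 and 7/2 motifs into a twist per residue and then searches for the smallest number of residues that closes a whole number of turns. The function names, the search limit of 50 residues, and the example twist of 103.3 degrees are assumptions introduced here.

```python
def twist_per_residue_deg(residues, turns):
    """Rotation about the helix axis per residue (degrees) when `residues`
    residues complete `turns` full 2*pi rotations."""
    return 360.0 * turns / residues


def smallest_period(twist_deg, max_residues=50, tol=1e-6):
    """Smallest residue count n <= max_residues for which n * twist_deg is a
    whole number of turns; returns (n, turns), or None if no such n exists."""
    for n in range(1, max_residues + 1):
        turns = n * twist_deg / 360.0
        if abs(turns - round(turns)) < tol:
            return n, round(turns)
    return None


# Residue and turn counts as quoted in the text for the two periodic motifs.
for name, (residues, turns) in {"10/3": (10, 3), "7/2": (7, 2)}.items():
    twist = twist_per_residue_deg(residues, turns)
    print(name, round(twist, 2), smallest_period(twist))

# A hypothetical twist that is not a simple fraction of 360 degrees:
# no period is found within 50 residues, i.e. the helix is effectively non-periodic.
print(smallest_period(103.3))
```

A generic twist angle behaves like the last case, which is the sense in which a long helix "in general will not be periodic".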

In high-resolution crystallographic studies of peptide-like structures, the question of
whether the 7/2 or the 10/3 structure describes the data is not always resolved.
R-factors indicate that both structures describe features of the diffraction data [10, 13, 14, 16].
Therefore atoms have been assigned differently to the three strands, and the assignment
of the atoms to the individual strands remains uncertain. Perhaps a third structure,
such as the CP structure, would give a better overall description and could account for some
of the experimental data not yet accounted for by either the 7/2 or the 10/3 structure.

Density functional theory (DFT) studies of various collagen-like peptides are an active
research area [17, 18], including the interesting issue of biomineralization [19]. For the
initial configurations of the first-principles calculations we used the approximate coordinates
for the CP structure obtained by simple geometrical methods [1] and, for the 7/2 structure,
the coordinates obtained from the PDB depository (entry 1K6F, 3[Pro-Pro-Gly]10) [13].
These structures were then truncated to various lengths (3(PPG)6, 3(PPG)3 and 3(PPG)1),
and hydrogens were added with the InsightII program from Biosym Technologies, Inc. To
treat the effects of an aqueous environment, we have utilized the polarizable continuum model
(PCM) as implemented in the Gaussian 09 program [20-22].

The semi-empirical wave function theory (WFT) PM6 model of Stewart [23] and two
Kohn-Sham density functional theory (KS-DFT) exchange-correlation functionals,
CAM-B3LYP [24, 25] and LC-ωPBE [26], have been used in this work. For the largest model,
3(PPG)6, only the PM6 model was used for geometry optimization.

Circular dichroism is commonly used to distinguish structures with various degrees of
chirality, e.g. α-helices, β-sheets, etc. The electronic circular dichroism (ECD) spectra were
simulated at the CAM-B3LYP time-dependent DFT (TD-DFT) level of theory for the two
smaller models only, 3(PPG)3 and 3(PPG)1. For both the 7/2 and CP 3(PPG)3 structures
the 81 lowest singlet transition energies were calculated, and for both the 7/2 and CP
3(PPG)1 structures the 27 lowest, in each case along with the corresponding dipole strengths
and rotational strengths in both the length and velocity formalisms.

Figure 1 depicts top and side views of the PM6/PCM CP and 7/2 structures for 3(PPG)6.
As one can see in Fig. 1(a), there is a visible open channel in the
CP structure surrounded by oxygen atoms. In Tables I and II we present the backbone and
side chain torsional angles of the residues in the three chains of the CP structures
3(PPG)3 and 3(PPG)6. In the Supplementary Material we present the backbone and side
chain angles for the complete 3(PPG)6 CP structure as well as its corresponding xyz file.
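
For readers who want to recompute such torsion angles from an xyz file, the following is a minimal sketch of a four-atom dihedral calculation using the standard atan2 formula. It is an illustration under stated assumptions, not the procedure used for the tables: the function name and the test coordinates are hypothetical, and the sign convention (positive for a clockwise rotation of the far bond when viewed along the central bond, cis = 0, trans = 180) is that of the formula itself.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle (degrees, in (-180, 180]) defined by four points,
    e.g. C', N, C-alpha, C' for a backbone phi angle."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)   # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates, chosen only to exercise the function.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # a quarter turn
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans arrangement
```

Applied to the atom quadruples named in the table row labels, a function of this kind yields one angle per residue and chain.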

In our analysis, the CP and 7/2 structures have nearly the same energy. The PM6 energy
difference between the 7/2 structure and the CP structure is E(7/2) - E(CP) =
-1.739 eV, or -0.065 eV per residue, for 3(PPG)3, and E(7/2) - E(CP) = -29.58 eV, or
-0.37 eV per residue, for 3(PPG)6. The larger energy difference for the longer structure
reflects the difficulties that the applied methods have in properly optimizing long helical
structures.

TABLE I: PM6/PCM 3(PPG)3 CP structure: backbone and side chain angles (deg)

Torsion angle        chain 1   chain 2   chain 3
φ1 (CδNCαC')            -62       -oO       -63
ψ1 (NCαC'N)             136       141       150
ω1 (CαC'NCα)           -176       177       169
ξ1,1 (NCαCβCγ)           -7       -14       -15
ξ1,2 (CαCβCγCδ)          14        19
ξ1,3 (CβCγCδN)          -16       -17       -20
ξ1,4 (CγCδNCα)           12         9        11
φ2 (C'NCαC')            -73       -71       -92
ψ2 (NCαC'N)             148       141       151
ω2 (CαC'NCα)            172       179      -176
ξ2,1 (NCαCβCγ)           14        14        21
ξ2,2 (CαCβCγCδ)         -14       -11       -18
ξ2,3 (CβCγCδN)           10         5         8
ξ2,4 (CγCδNCα)           -1         4         6
φ3 (C'NCαC')            -76       -95      -117
ψ3,1 (NCαC'O1)          -59       109        58
ψ3,2 (NCαC'O2)          169       179       173

Collagen has also been studied by measurements of circular dichroism (CD) spectra
[27-29]. Figure 2 shows the calculated ECD spectra of the CP and 7/2 motifs along with
experimental data. The experimental ECD spectrum is reproduced from Ref. [29]. The data
show a characteristic negative peak at around 200 nm and a smaller positive peak at 220
nm. For the calculated CAM-B3LYP/KS-TD-DFT/PCM ECD spectra of
the PM6/PCM 3(PPG)3 structures, the semi-empirical PM6 WFT was used for the geometry
optimizations and KS-TD-DFT with the CAM-B3LYP exchange-correlation (XC) functional
was used for the ECD calculations. The electronic absorption (EA) and ECD spectra were all
simulated with FWHM line widths of 15 nm. As one can see by comparison, neither of the two
calculations describes the spectra significantly better than the other. The 7/2 model is somewhat better
at predicting the positive peak around 220 nm. We have observed that the detailed shape
of the calculated spectra is sensitive to the exact values of the refined coordinates, and hence
we must assume some level of uncertainty simply due to the accuracy of the methods. The
integrated intensities of the experimental spectrum (Ref. [29]), the CP spectrum, and the 7/2
spectrum are -8.4 × 10⁻², -9.2 × 10⁻², and -2.9 × 10⁻² deg·cm³·dmol⁻¹, respectively. Hence, the
CP calculation agrees better with experiment with regard to the negative bias. The reason for
this is that the CP structure does not have the right-handed supercoil geometry that reduces
the overall left-handed chirality of the 7/2 structure.

TABLE II: PM6/PCM 3(PPG)6 CP structure: average backbone and side chain angles (deg)
for the two middle PPG groups.

Torsion angle        chain 1   chain 2   chain 3
φ1 (C'NCαC')            -21       -26       -27
ψ1 (NCαC'N)             176       180       178
ω1 (CαC'NCα)           -176       177       178
ξ1,1 (NCαCβCγ)           10         6        12
ξ1,2 (CαCβCγCδ)         -32       -28       -33
ξ1,3 (CβCγCδN)           41        40        41
ξ1,4 (CγCδNCα)          -36       -38       -35
φ2 (C'NCαC')             12         7        13
ψ2 (NCαC'N)             146       141       150
ω2 (CαC'NCα)           -176      -172      -179
ξ2,1 (NCαCβCγ)          -44       -48       -45
ξ2,2 (CαCβCγCδ)          27        36        29
ξ2,3 (CβCγCδN)                    -11        -3
ξ2,4 (CγCδNCα)          -27       -18       -25
φ3 (C'NCαC')            -94       -75       -93
ψ3,1 (NCαC'O1)         -130      -124      -126
ψ3,2 (NCαC'O2)         -170      -176      -176

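
How a set of transition energies and rotational strengths becomes a broadened ECD curve with an integrated intensity can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the original calculations: a Gaussian line shape in wavelength is assumed (the text specifies only the 15 nm FWHM), the two transitions are placeholder values rather than computed results, and all conversion factors to mean residue ellipticity are omitted. The function names are hypothetical.

```python
import math

HC_EV_NM = 1239.841984  # h*c in eV*nm, converts a transition energy to a wavelength


def energy_to_wavelength_nm(energy_ev):
    return HC_EV_NM / energy_ev


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def broadened_spectrum(grid_nm, transitions, fwhm_nm=15.0):
    """Sum of unit-area Gaussians, one per (wavelength_nm, rotational_strength)
    pair, each weighted by its rotational strength (arbitrary units)."""
    sigma = fwhm_to_sigma(fwhm_nm)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [
        sum(r * norm * math.exp(-0.5 * ((lam - lam0) / sigma) ** 2)
            for lam0, r in transitions)
        for lam in grid_nm
    ]


def integrate_trapezoid(x, y):
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


# Placeholder transitions: a strong negative band near 200 nm, a weak positive one near 220 nm.
transitions = [(200.0, -2.0), (220.0, 0.5)]
grid = [100.0 + 0.1 * i for i in range(3001)]  # 100 nm to 400 nm
spectrum = broadened_spectrum(grid, transitions)
area = integrate_trapezoid(grid, spectrum)
print(round(area, 6))  # equals the sum of the rotational strengths, here negative
```

Because each Gaussian has unit area, the integrated intensity reduces to the sum of the rotational strengths, so a net negative value signals the kind of negative bias discussed in the text.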
Surprisingly, the central channel of the CP structure seen in Fig. 1(a) opens the possibility
for collagen to hold, intercalate, and transport small molecules. In Ref. [1], the diameter
of the channel was estimated to be about 2 Å. In Fig. 3 we depict the minimized
3(PPG)6 structure with four H2 molecules in the channel. First we placed eight atomic
hydrogens in the channel with a spacing of about 3.5 Å. The purpose was
to see whether the hydrogens would interact with the oxygens of the collagen strands or
with each other to form molecular hydrogen. The result was that molecular
hydrogen was formed. That the CP structure is adaptable to such changes demonstrates
its relative stability.
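
Whether a minimized geometry contains H2 molecules can be checked by a simple distance criterion. The sketch below is an illustration only: the 1.0 Å cutoff, the function name, and the "relaxed" coordinates are assumptions introduced here. Only the 3.5 Å starting spacing and the roughly 6 Å spacing between molecules (Fig. 3) come from the text, and 0.74 Å is the known equilibrium H-H bond length.

```python
import math

H2_BOND_LENGTH = 0.74  # Angstrom, equilibrium H-H distance in molecular hydrogen


def find_h2_pairs(positions, cutoff=1.0):
    """Greedily pair hydrogen atoms closer than `cutoff` (Angstrom), closest
    pairs first; returns a list of (i, j, distance) with each atom used once."""
    candidates = sorted(
        (math.dist(positions[i], positions[j]), i, j)
        for i in range(len(positions))
        for j in range(i + 1, len(positions))
    )
    pairs, used = [], set()
    for d, i, j in candidates:
        if d > cutoff:
            break
        if i not in used and j not in used:
            pairs.append((i, j, d))
            used.update((i, j))
    return pairs


# Starting arrangement described in the text: eight H atoms about 3.5 Angstrom apart.
initial = [(0.0, 0.0, 3.5 * k) for k in range(8)]

# Hypothetical relaxed arrangement: four molecules with centres about 6 Angstrom apart.
relaxed = []
for m in range(4):
    centre = 6.0 * m
    relaxed.append((0.0, 0.0, centre - H2_BOND_LENGTH / 2))
    relaxed.append((0.0, 0.0, centre + H2_BOND_LENGTH / 2))

print(len(find_h2_pairs(initial)), len(find_h2_pairs(relaxed)))
```

With the starting geometry no pair falls below the cutoff, while the relaxed arrangement yields four bonded pairs.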

In this Letter we have reported the fascinating result that triple-helix structures for
collagen which fulfill the close-packing principle are atomically possible and chemically
plausible according to the performed minimization. Whether this is a physical result or
an effect of the applied minimization algorithms remains to be seen. First-principles
semi-empirical WFT and KS-DFT calculations of the structures and properties of collagen
have been performed, comparing the close-packed (CP) motif with the supercoiled (7/2) motif.
It is shown that, within the methods applied, both structures behave as stable structures and
both have roughly the same energy. Explicit solvent molecules can not only change the
relative energy of various conformations of peptides, but can even stabilize conformers
and species that are not stable either in the gas phase or in simple continuum solvent
models like PCM [20]. Neither of the two calculations could describe all features of the ECD
spectra, while the negative bias is well described by the CP structure.
Atomic hydrogens placed in the channel combine to form molecular hydrogen and may
perhaps enhance the stabilization of the close-packed structure. The CP
structure with its central cavity appears to be a possible hydrogen-molecule channel protein,
as atomic hydrogen can form stable molecular hydrogen in this channel. It is an intriguing
possibility that such H2 molecules have a biological significance and function.

This work is supported by the Villum Foundation. KJJ wishes to thank DKFZ and DTU 
for hospitality. 

FIG. 1: PM6/PCM CP and 7/2 structures for the collagen-like peptide 3(PPG)6: (a) top view
of the CP structure, (b) top view of the 7/2 structure, (c) side view of the CP
structure, and (d) side view of the 7/2 structure.

FIG. 2: Electronic circular dichroism CAM-B3LYP/PCM for CP (smooth blue curve) and
7/2 (smooth dotted red curve) structures and ECD M06/PCM for CP structure of
3(PPG)3. The experimental collagen data (black curve) are replotted from Ref. [29] after
digitizing. Mean residue ellipticity (MRE) is measured in units of deg·cm²·dmol⁻¹.

FIG. 3: (a) A ball-and-stick model of the close-packed 3(PPG)6 structure with four H2
molecules in the central channel. (b) Expanded view of the part of the structure containing
the H2 molecules. The vertical distance between individual H2 molecules is about 6 Å.

TABLE III: Supplementary Material A: PM6/PCM 3(PPG)6 CP structure: backbone and
side chain angles (deg)

Residue  Torsion angle         chain 1   chain 2   chain 3
Pro1     τ(H1+NCαC')           -102.20    -79.80   -108.81
         τ(H2+NCαC')             16.15     39.02      9.55
         τ(CδNCαC')             134.31    156.42    128.63
         ψ1 (NCαC'N)           -165.33    169.54    168.12
         ω1 (CαC'NCα)           166.95    172.73    171.74
         ξ1,1 (NCαCβCγ)          19.51    -11.33     21.01
         ξ1,2 (CαCβCγCδ)        -36.67    -15.24    -41.10
         ξ1,3 (CβCγCδN)          38.64     35.94     44.98
         ξ1,4 (CγCδNCα)         -27.16    -43.32    -32.20
Pro2     φ2 (C'NCαC')             3.52    -10.84     -8.64
         ψ2 (NCαC'N)            151.50    145.37    126.67
         ω2 (CαC'NCα)          -167.34   -153.78   -168.18
         ξ2,1 (NCαCβCγ)         -47.54    -36.19    -29.67
         ξ2,2 (CαCβCγCδ)         40.41     26.71     13.83
         ξ2,3 (CβCγCδN)         -18.32     -7.10      7.11
         ξ2,4 (CγCδNCα)         -11.59    -15.98    -26.25
Gly3     φ3 (C'NCαC')           -96.50    -95.00    -90.36
         ψ3 (NCαC'N)           -123.41   -131.84   -134.10
         ω3 (CαC'NCα)          -175.02   -179.96   -175.81
Pro4     φ4 (C'NCαC')           -26.40    -27.83    -19.11
         ψ4 (NCαC'N)            174.92    175.71    173.02
         ω4 (CαC'NCα)           179.99    160.11    151.71
         ξ4,1 (NCαCβCγ)           9.90      1.50    -23.20
         ξ4,2 (CαCβCγCδ)        -31.60     23.95      2.67
         ξ4,3 (CβCγCδN)          41.15     37.37     18.97
         ξ4,4 (CγCδNCα)         -36.76    -38.23    -34.40
Pro5     φ5 (C'NCαC')            -5.91     22.34     15.11
         ψ5 (NCαC'N)            155.36    148.26    139.95
         ω5 (CαC'NCα)           178.86    178.26    179.68
         ξ5,1 (NCαCβCγ)         -48.80    -43.13    -45.73
         ξ5,2 (CαCβCγCδ)         44.97     20.51     27.71
         ξ5,3 (CβCγCδN)         -25.77      9.32     -0.10
         ξ5,4 (CγCδNCα)          -4.44    -36.41    -28.61
Gly6     φ6 (C'NCαC')           -82.75    -88.83    -84.60
         ψ6 (NCαC'N)           -127.38   -136.72   -123.46
         ω6 (CαC'NCα)          -171.50   -171.41   -179.43
Pro7     φ7 (C'NCαC')           -23.87    -28.63    -20.98
         ψ7 (NCαC'N)            171.73   -170.97    170.33
         ω7 (CαC'NCα)          -178.22   -170.14   -178.33
         ξ7,1 (NCαCβCγ)          12.22     14.87      6.60
         ξ7,2 (CαCβCγCδ)        -33.72    -35.79    -29.24
         ξ7,3 (CβCγCδN)          42.07     42.65     40.79
         ξ7,4 (CγCδNCα)         -36.33    -35.26    -38.38
Pro8     φ8 (C'NCαC')             5.59      1.91     12.53
         ψ8 (NCαC'N)            154.23    146.41    153.71
         ω8 (CαC'NCα)          -176.20    172.85    179.19
         ξ8,1 (NCαCβCγ)         -45.77    -48.13    -43.94
         ξ8,2 (CαCβCγCδ)         32.92     40.85     27.95
         ξ8,3 (CβCγCδN)          -8.89    -19.92     -2.64
         ξ8,4 (CγCδNCα)         -19.69    -10.12    -24.67
Gly9     φ9 (C'NCαC')           -99.80    -75.67    -86.24
         ψ9 (NCαC'O1)          -120.41   -120.49   -135.51
         ω9 (CαC'NCα)           174.76   -171.84   -170.89
Pro10    φ10 (C'NCαC')          -18.85    -18.14    -32.86
         ψ10 (NCαC'N)          -179.25    171.95   -174.97
         ω10 (CαC'NCα)         -173.67    163.73    174.71
         ξ10,1 (NCαCβCγ)          8.49     -2.46     17.61
         ξ10,2 (CαCβCγCδ)       -29.91    -21.03    -36.90
         ξ10,3 (CβCγCδN)         39.72     36.68     41.40
         ξ10,4 (CγCδNCα)        -36.38    -39.80    -32.39
Pro11    φ11 (C'NCαC')           18.57     12.69     12.57
         ψ11 (NCαC'N)           138.68    135.57    146.41
         ω11 (CαC'NCα)         -176.07    171.00   -177.61
         ξ11,1 (NCαCβCγ)        -42.22    -48.19    -46.91
         ξ11,2 (CαCβCγCδ)        20.35     31.03     29.91
         ξ11,3 (CβCγCδN)          8.19     -3.29     -3.04
         ξ11,4 (CγCδNCα)        -33.90    -26.80    -25.99
Gly12    φ12 (C'NCαC')          -88.71    -74.84   -100.12
         ψ12 (NCαC'O1)         -138.87   -127.36   -115.61
         ω12 (CαC'NCα)         -165.94    179.81    178.98
Pro13    φ13 (C'NCαC')          -38.34    -15.81    -20.26
         ψ13 (NCαC'N)           172.83    176.73    177.29
         ω13 (CαC'NCα)         -171.90   -171.79   -172.24
         ξ13,1 (NCαCβCγ)         15.47      7.64      6.63
         ξ13,2 (CαCβCγCδ)       -35.25    -30.31    -28.29
         ξ13,3 (CβCγCδN)         41.38     41.59     39.15
         ξ13,4 (CγCδNCα)        -33.30    -38.43    -36.69
Pro14    φ14 (C'NCαC')           -9.84      6.93      5.04
         ψ14 (NCαC'N)           151.61    139.91    142.65
         ω14 (CαC'NCα)         -178.61   -175.10    172.46
         ξ14,1 (NCαCβCγ)        -42.93    -44.10    -48.58
         ξ14,2 (CαCβCγCδ)        38.22     31.49     42.13
         ξ14,3 (CβCγCδN)        -20.43     -8.56    -21.55
         ξ14,4 (CγCδNCα)         -6.37    -18.71     -8.71
Gly15    φ15 (C'NCαC')          -71.93    -91.90    -87.76
         ψ15 (NCαC'O1)         -124.14   -141.69   -127.43
         ω15 (CαC'NCα)         -179.02   -178.55    175.58
Pro16    φ16 (C'NCαC')          -16.67    -35.40    -17.54
         ψ16 (NCαC'N)          -174.59   -164.62    174.74
         ω16 (CαC'NCα)         -160.20   -154.66   -166.38
         ξ16,1 (NCαCβCγ)          8.94     20.80     11.72
         ξ16,2 (CαCβCγCδ)       -29.67    -38.56    -33.80
         ξ16,3 (CβCγCδN)         39.10     41.49     43.06
         ξ16,4 (CγCδNCα)        -34.95    -29.84    -37.43
Pro17    φ17 (C'NCαC')           11.31     17.88     12.28
         ψ17 (NCαC'N)           143.24    138.64    137.55
         ω17 (CαC'NCα)          171.00    172.50   -177.94
         ξ17,1 (NCαCβCγ)        -40.07    -42.40    -45.58
         ξ17,2 (CαCβCγCδ)        28.09     27.34     31.62
         ξ17,3 (CβCγCδN)         -6.91     -3.47     -7.57
         ξ17,4 (CγCδNCα)        -17.93    -22.79    -20.46
Gly18    φ18 (C'NCαC')          -70.80    -98.45    -89.04
         ψ18,1 (NCαC'O1)          7.74     38.56     24.69
         ψ18,2 (NCαC'O2)       -172.34   -141.93   -156.42
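
Averaging torsion angles over residues, as Table II does for the two middle PPG groups, needs care for angles close to ±180°, where a plain arithmetic mean can land near 0°. A circular mean avoids this. The sketch below is an illustration only: how the averages in Table II were actually formed is not stated in the text, and the function name is hypothetical. The two input values are the chain 1 ψ angles listed above for Pro7 and Pro10.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction of a set of angles (degrees), returned in (-180, 180]."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))


psi_chain1 = [171.73, -179.25]  # psi of Pro7 and Pro10, chain 1, from the table above

print(round(sum(psi_chain1) / len(psi_chain1), 2))  # arithmetic mean: -3.76
print(round(circular_mean_deg(psi_chain1), 2))      # circular mean: 176.24
```

The arithmetic mean points almost opposite to both inputs, whereas the circular mean stays between them on the short arc through 180°.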



